
Label Propagation Algorithm




Individual Bus Trip Chain Prediction and Pattern Identification Considering Similarities

Huang, Xiannan, Chen, Yixin, Yuan, Quan, Yang, Chao

arXiv.org Artificial Intelligence

Predicting future bus trip chains for an existing user is of great significance for operators of public transit systems. Existing methods typically treat this task as a time-series prediction problem, but a 1-dimensional time-series structure cannot express the complex relationships between trips. To better capture the inherent patterns in bus travel behavior, this paper proposes a novel approach that synthesizes future bus trip chains from those of similar days. Key similarity patterns are defined and tested using real-world data, and a similarity function is then developed to capture these patterns. Afterwards, a graph is constructed in which each day is represented as a node and edge weights reflect the similarity between days. Moreover, the trips on a given day can be regarded as labels for each node, transforming the bus trip chain prediction problem into a semi-supervised classification problem on a graph. To address this, we propose several methods and validate them on a real-world dataset of 10,000 bus users, achieving state-of-the-art prediction results. Analyzing the parameters of the similarity function reveals some interesting bus usage patterns, allowing us to cluster bus users into three types: repeat-dominated, evolve-dominated, and repeat-evolve balanced. In summary, our work demonstrates the effectiveness of similarity-based prediction for bus trip chains and provides a new perspective for analyzing individual bus travel patterns. The code for our prediction model is publicly available.
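The graph-and-propagation formulation described in this abstract can be sketched in a few lines. Everything below (the per-day feature vectors, the RBF similarity, the number of label classes) is an illustrative assumption, not the paper's actual similarity function or data:

```python
import numpy as np

# Hypothetical sketch: days are nodes, edge weights encode similarity
# between days, and known trip-chain types on some days are propagated
# to the remaining (unlabeled) days.
rng = np.random.default_rng(0)
n_days, n_classes = 8, 2
features = rng.normal(size=(n_days, 3))        # per-day descriptors (assumed)
d2 = ((features[:, None] - features[None]) ** 2).sum(-1)
W = np.exp(-d2)                                # RBF similarity as edge weight
np.fill_diagonal(W, 0.0)

labels = np.full(n_days, -1)
labels[:3] = [0, 1, 0]                         # a few observed days

# Iterative label propagation: unlabeled nodes absorb label mass from
# their neighbors; observed days are clamped to their true labels.
F = np.zeros((n_days, n_classes))
F[labels >= 0, labels[labels >= 0]] = 1.0
P = W / W.sum(1, keepdims=True)                # row-normalized transition matrix
for _ in range(50):
    F = P @ F
    F[labels >= 0] = 0.0
    F[labels >= 0, labels[labels >= 0]] = 1.0  # re-clamp observed days

pred = F.argmax(1)                             # inferred class for every day
```

The clamping step is what makes this transductive: observed days keep their labels while repeatedly pushing them along high-similarity edges.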


Label Propagation Techniques for Artifact Detection in Imbalanced Classes using Photoplethysmogram Signals

Macabiau, Clara, Le, Thanh-Dung, Albert, Kevin, Jouvet, Philippe, Noumeir, Rita

arXiv.org Artificial Intelligence

Photoplethysmogram (PPG) signals are widely used in healthcare for monitoring vital signs, but they are susceptible to motion artifacts that can lead to inaccurate interpretations. In this study, the use of label propagation techniques to propagate labels among PPG samples is explored, particularly in imbalanced class scenarios where clean PPG samples are significantly outnumbered by artifact-contaminated samples. With a precision of 91%, a recall of 90%, and an F1 score of 90% for the artifact-free class, the results demonstrate its effectiveness in labeling a medical dataset, even when clean samples are rare. For artifact classification, our study compares supervised classifiers, including conventional classifiers and neural networks (MLP, Transformers, FCN), with the semi-supervised label propagation algorithm. With a precision of 89%, a recall of 95%, and an F1 score of 92%, the supervised KNN model gives good results, but the semi-supervised algorithm performs better at detecting artifacts. The findings suggest that the semi-supervised label propagation algorithm holds promise for artifact detection in PPG signals, which can enhance the reliability of PPG-based health monitoring systems in real-world applications.
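As a rough illustration of how label propagation fills in labels under class imbalance, the following sketch uses scikit-learn's `LabelPropagation` on synthetic 2-D points standing in for PPG-derived features; the data, kernel, and `gamma` value are assumptions, not the study's setup:

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Illustrative only: random 2-D points stand in for PPG-derived features;
# class 1 (artifact) deliberately outnumbers class 0 (clean), mirroring
# the imbalanced scenario described in the abstract.
rng = np.random.default_rng(1)
clean = rng.normal(loc=0.0, scale=0.5, size=(20, 2))
artifact = rng.normal(loc=3.0, scale=0.5, size=(80, 2))
X = np.vstack([clean, artifact])

y = np.full(len(X), -1)          # -1 marks unlabeled samples
y[:5] = 0                        # a handful of labeled clean samples
y[20:30] = 1                     # and some labeled artifacts

model = LabelPropagation(kernel="rbf", gamma=1.0).fit(X, y)
pred = model.transduction_       # labels inferred for every sample
```

`transduction_` holds the propagated label for every sample, labeled or not, which is how a mostly-unlabeled medical dataset can be annotated from a few trusted examples.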


Differential equation and probability inspired graph neural networks for latent variable learning

Shi, Zhuangwei

arXiv.org Artificial Intelligence

Probability theory and differential equations are powerful tools for the interpretability and design of machine learning models, especially for illuminating the mathematical motivation of learning latent variables from observations. Subspace learning maps high-dimensional features onto a low-dimensional subspace to capture efficient representations. Graphs are widely applied for modeling latent variable learning problems, and graph neural networks implement deep learning architectures on graphs. Inspired by probability theory and differential equations, this paper presents notes and proposals on graph neural networks that solve subspace learning problems via variational inference and differential equations. Source code of this paper is available at https://github.com/zshicode/Latent-variable-GNN.
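One concrete link between differential equations and graph learning is feature diffusion governed by the graph Laplacian. The Euler-integration sketch below is a generic illustration of that idea, not this paper's model:

```python
import numpy as np

# Feature diffusion on a graph follows dX/dt = -L X, where L is the graph
# Laplacian; explicit Euler steps give the update X <- X - dt * L X, a
# building block of several differential-equation-inspired GNN layers.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # path graph on 4 nodes
L = np.diag(A.sum(1)) - A                   # combinatorial Laplacian

X = np.array([[1.0], [0.0], [0.0], [0.0]])  # signal concentrated on node 0
dt = 0.1                                    # step size within stability bound
for _ in range(100):
    X = X - dt * (L @ X)

# Diffusion conserves total mass (1^T L = 0) and smooths the signal
# toward the uniform vector, here 0.25 on every node.
```

The step size must satisfy `dt < 2 / lambda_max(L)` for the explicit scheme to be stable; learned GNN variants replace the fixed `dt` and `L` with trainable operators.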


Incremental Semi-Supervised Learning Through Optimal Transport

Hamri, Mourad El, Bennani, Younès

arXiv.org Machine Learning

Semi-supervised learning has recently emerged as one of the most promising paradigms to mitigate the reliance of deep learning on huge amounts of labeled data, especially in learning tasks where it is costly to collect annotated data. This is best illustrated in medicine, where measurements require expensive machinery and labels are the result of a costly, time-consuming, human-assisted analysis. Semi-supervised learning (SSL) aims to largely reduce the need for massive labeled datasets by allowing a model to leverage both labeled and unlabeled data. Among the many semi-supervised learning approaches, graph-based techniques are increasingly being studied due to their performance and the growing availability of real graph datasets. The problem is to predict the labels of all unlabeled vertices in the graph given labels for only a small subset of vertices. To date, a number of graph-based algorithms, in particular label propagation methods, have been successfully applied to different fields, such as social network analysis [7][50][51][25], natural language processing [1][43][3], and image segmentation [47][10]. The performance of label propagation algorithms is often affected by the graph-construction method and the technique used to infer pseudo-labels.


Learning in Graphs with Python (Part 3)

#artificialintelligence

Let's start with Link Prediction! In Link Prediction, given a graph G, we aim to predict new edges. Such predictions are useful for anticipating future relations, recovering missing edges when the graph is not fully observed, or making suggestions when new customers join a platform (e.g. a new LinkedIn user). Link prediction for a new LinkedIn user would simply be a suggestion of people they might know. In link prediction, we build a similarity measure between pairs of nodes and link the most similar nodes.
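A minimal version of such a similarity measure is the Jaccard coefficient of two nodes' neighbor sets (one of many possible choices); the toy graph and names below are invented for illustration:

```python
# Toy adjacency structure: each node maps to its set of neighbors.
graph = {
    "ana":  {"bob", "carl"},
    "bob":  {"ana", "carl", "dana"},
    "carl": {"ana", "bob", "dana"},
    "dana": {"bob", "carl", "eve"},
    "eve":  {"dana"},
}

def jaccard(u, v):
    """Similarity of u and v: shared neighbors over all neighbors."""
    inter = graph[u] & graph[v]
    union = graph[u] | graph[v]
    return len(inter) / len(union)

# Rank every non-adjacent pair; the top-scoring pair is the suggested
# new edge (the "people you may know" recommendation).
candidates = [(u, v) for u in graph for v in graph
              if u < v and v not in graph[u]]
best = max(candidates, key=lambda p: jaccard(*p))
```

Here "ana" and "dana" share two of three combined neighbors, so they come out on top; richer measures (Adamic-Adar, embedding dot products) follow the same score-and-rank pattern.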


Faster Support Vector Machines

Schlag, Sebastian, Schmitt, Matthias, Schulz, Christian

arXiv.org Machine Learning

The time complexity of support vector machines (SVMs) prohibits training on huge data sets with millions of samples. Recently, multilevel approaches to training SVMs have been developed to allow for time-efficient training on huge data sets. While regular SVMs perform the entire training in a single, time-consuming optimization step, multilevel SVMs first build a hierarchy of problems decreasing in size that resemble the original problem, and then train an SVM model for each hierarchy level, benefiting from the solved models of previous levels. We present a faster multilevel support vector machine that uses a label propagation algorithm to construct the problem hierarchy. Extensive experiments show that our new algorithm achieves speed-ups of up to two orders of magnitude while having similar or better classification quality than state-of-the-art algorithms.
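A one-level sketch of this idea, simplified well beyond the paper's method: a label-propagation pass over a k-NN graph groups nearby points, each group collapses to a centroid, and the SVM trains on the much smaller coarse problem. All data and parameters below are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

X, y = make_blobs(n_samples=600, centers=2, cluster_std=0.6, random_state=0)

# Build a k-NN graph over the training points.
k = 10
nn = NearestNeighbors(n_neighbors=k).fit(X)
_, idx = nn.kneighbors(X)

# Label-propagation-style grouping: every point starts in its own
# community and repeatedly adopts the most common community among
# its k nearest neighbors.
comm = np.arange(len(X))
for _ in range(10):
    for i in range(len(X)):
        vals, counts = np.unique(comm[idx[i]], return_counts=True)
        comm[i] = vals[counts.argmax()]

# Coarsen: one centroid per community, labeled by majority vote.
coarse_X, coarse_y = [], []
for c in np.unique(comm):
    members = comm == c
    coarse_X.append(X[members].mean(0))
    coarse_y.append(np.bincount(y[members]).argmax())

# Train on the coarse problem; evaluate on the full data set.
svm = SVC(kernel="rbf").fit(np.array(coarse_X), coarse_y)
acc = svm.score(X, y)
```

The full multilevel scheme repeats this coarsening to build a hierarchy and refines the SVM at each level; this sketch shows only why the coarse problem can be orders of magnitude smaller than the original.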